A framework for robust MFCC feature extraction using SNR-dependent compression of enhanced mel filter bank energies
Abstract
Mel-frequency cepstral coefficients (MFCC) are the most widely used and successful features for speech recognition. However, their performance degrades in the presence of additive noise. In this paper, we propose a noise compensation method for Mel filter bank energies and, consequently, for MFCC features. The method has two steps: Mel sub-band spectral subtraction, followed by compression of the Mel sub-band energies. In the compression step, we propose a sub-band SNR-dependent compression function, which replaces the logarithm function of conventional MFCC feature extraction when additive noise is present. Experimental results show that the proposed method significantly improves the performance of MFCC features in noisy conditions: it reduces the word error rate by about 70% at an SNR of 0 dB for different types of additive noise.
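The two-step pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the actual compression function, so `snr_dependent_compress` uses a hypothetical form, `log(E + beta)`, where the offset `beta` shrinks to zero as the sub-band SNR rises, so that the function reduces to the conventional logarithm at high SNR. The function names, the flooring factor, the SNR thresholds, and `beta_max` are all assumptions, and the noise estimate per sub-band is taken as given.

```python
import math


def spectral_subtraction(noisy, noise, floor=0.01):
    """Subtract a noise estimate from each Mel sub-band energy.

    The result is floored at a fraction of the noisy energy (assumed
    factor) so that it never becomes negative.
    """
    return [max(y - n, floor * y) for y, n in zip(noisy, noise)]


def subband_snr_db(enhanced, noise, eps=1e-10):
    """Estimate the SNR in dB for each sub-band from the enhanced energies."""
    return [10.0 * math.log10((s + eps) / (n + eps))
            for s, n in zip(enhanced, noise)]


def snr_dependent_compress(energies, snr_db, low_db=0.0, high_db=20.0,
                           beta_max=1.0, eps=1e-10):
    """Hypothetical SNR-dependent compression (not the paper's function).

    w goes from 0 at low_db to 1 at high_db. At w = 1 the offset is zero
    and the output is the plain logarithm used in conventional MFCC.
    """
    out = []
    for e, snr in zip(energies, snr_db):
        w = min(max((snr - low_db) / (high_db - low_db), 0.0), 1.0)
        beta = beta_max * (1.0 - w)
        out.append(math.log(max(e + beta, eps)))
    return out


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II, used to turn compressed energies into cepstra."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


def robust_mfcc(noisy, noise, num_coeffs=13):
    """Chain the steps: subtraction, SNR estimate, compression, DCT."""
    enhanced = spectral_subtraction(noisy, noise)
    snr = subband_snr_db(enhanced, noise)
    compressed = snr_dependent_compress(enhanced, snr)
    return dct_ii(compressed, num_coeffs)
```

Here `noisy` and `noise` are lists of Mel filter bank energies for one frame, for example `robust_mfcc([10.0, 8.0, 6.0, 4.0], [1.0, 1.0, 1.0, 1.0], num_coeffs=3)`. Keeping the compression step as a separate function makes it easy to swap in a different SNR-dependent function.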
Related resources
Mel sub-band filtering and compression for robust speech recognition
Mel-frequency cepstral coefficients (MFCC) are commonly used in speech recognition systems. However, they are highly sensitive to the presence of external noise. In this paper, we propose a noise compensation method for Mel filter bank energies and, consequently, for MFCC features. The method is performed in two stages: Mel sub-band filtering, followed by compression of the Mel sub-band energies. In the compre...
Class-Dependent PCA Optimization Using Genetic Programming for Robust MFCC Extraction
Principal component analysis (PCA) is commonly used in feature extraction. It projects the features in the direction of maximum variance. This projection can be performed in a class-dependent or class-independent manner. In this paper, we propose to optimize the class-dependent PCA transformation matrix for robust MFCC feature extraction using genetic programming. For this purpose, we first map logarit...
Genetic programming based optimization of class-dependent PCA for extracting robust MFCC
Principal component analysis (PCA) is commonly used in feature extraction. It projects the features in the direction of maximum variance. This projection can be performed in a class-dependent or class-independent manner. In this paper, we propose to optimize the class-dependent PCA transformation matrix for robust MFCC feature extraction using genetic programming. For this purpose, we first map logarit...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Filter Bank Feature Extraction for Gaussian Mixture Model Speaker Recognition
Speaker recognition is the task of identifying an individual from their voice. Typically, this task is performed in two consecutive stages: feature extraction and classification. Using a Gaussian Mixture Model (GMM) classifier, different filter-bank configurations were compared as feature extraction techniques for speaker recognition. The filter-banks were also compared to the popular Mel-Frequen...